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(54) Abstract Title 

Remotely assessing which of the software modules installed in a server are active 

(57) In a distributed data processing system there are plural application servers each having a database of 
executable data management tasks and an initialisation list indicating which of these tasks should be active. A 
server can be probed to determine whether it has successfully initialised all the tasks in its initialisation list. 
The probed server can be instructed to initialise tasks that have failed to become active. Monitoring of network 
messages and integrity of e-mail routes are also disclosed, as is checking the replication of database changes 
between servers. 

SERVER PROBE METHOD AND APPARATUS FOR A DISTRIBUTED DATA PROCESSING SYSTEM 

The present invention relates to a method and apparatus for probing 
remote application servers in a distributed data processing system. 

Some conventional data processing environments comprise a plurality 
of user terminals connected to a central host data processing system. 
Such data processing environments are typically referred to as central or 
host environments. 

Increasing in popularity are distributed data processing environments 
in which user terminals are connected to plural server data processing 
systems. 

In both of the above examples, the cost of systems management can 
be measured by the ratio of administrators (or operation support staff) 
to users. In a typical distributed environment, such as an environment 
providing a Lotus Notes service or similar distributed client-server 
database application, the ratio is relatively high: one Lotus Notes 
(Lotus and Lotus Notes are trade marks of Lotus Development Corporation) 
administrator may have difficulty controlling 200 users of a fully 
functional Lotus Notes service. By comparison, in a typical host 
environment such as an OfficeVision environment (OfficeVision is a trade 
mark of International Business Machines Corporation), a single 
administrator may comfortably control thousands of users. 

In a typical distributed environment employing a distributed 
database management system, a group of administrators collectively 
perform operational tasks associated with management of servers such as 
Groupware and E mail servers. Both E Mail and Groupware applications 
usually generate megabytes of information during normal daily operation. 
The information is typically stored in a log format. The logs are 
preferably processed with a view to identifying error conditions and thus 
to eliminating or at least reducing application server failures. However, 
the processing of such logs is a laborious activity. It would 
therefore be desirable to improve automation of server management in a 
distributed environment. 



In accordance with the present invention, there is now provided 
server probe apparatus for a distributed data processing system having 
plural application server computer systems interconnected via a network, 
each application server having a database application including a 
plurality of executable data management tasks and an initialisation table 
storing a task configuration indicative of which ones of the tasks should 
be active, the apparatus comprising: means for reading the task 
configuration from the initialisation table from a target one of the 
application servers; means for identifying the tasks which are active in 
the target one of the application servers; and means for generating an 
event message if active tasks identified differ from the tasks specified 
in the initialisation table. 

The generating means preferably comprises restart means for making 
successive attempts to restart tasks which are specified in the 
initialisation table but inactive in the target one of the application 
servers and means for generating the event message after a predetermined 
plurality of failed attempts by the restart means to restart the or each 
inactive task. 

In a preferred embodiment of the present invention, there is 
additionally provided an administration terminal having means for 
displaying the event messages. 

It will be appreciated that the present invention extends to a 
distributed data processing system having plural application server 
computer systems interconnected via a network, each application server 
having a database application including a plurality of executable data 
management tasks and an initialisation table storing a task configuration 
indicative of which ones of the tasks should be active, and server probe 
apparatus as hereinbefore described. 

Viewing the present invention from another aspect, there is now 
provided a method for managing data management tasks in a distributed 
data processing system having plural application server computer systems 
interconnected via a network, each application server having a database 
application including a plurality of executable data management tasks and 
an initialisation table storing a task configuration indicative of which 
ones of the tasks should be active; the method comprising the steps of: 
reading the task configuration from the initialisation table from a 
target one of the application servers; identifying the tasks which are 
active in the target one of the application servers; and generating an 
event message if active tasks identified differ from the tasks specified 
in the initialisation table. 



Preferred embodiments of the present invention will now be 
described, by way of example only, with reference to the accompanying 
drawings, in which: 



Figure 1 is a block diagram of a distributed data processing 
system; 

Figure 2 is a more detailed block diagram of the data processing 
system of Figure 1; 

Figure 3 is a block diagram of a DSM server of the system shown in 
Figure 2; 

Figure 4 is a block diagram of an application server of the system 
shown in Figure 2; 

Figure 5 is a block diagram of a high level architecture for the 
DSM server; 

Figure 6 is a block diagram of software stored in an application 
server of the system shown in Figure 2; 

Figure 7 is a functional block diagram of the DSM server; 

Figure 8 is a block diagram of a server probe function of the DSM 
server in the form of a flow chart; 

Figure 9 is a block diagram of a mail probe function of the DSM 
server in the form of a flow chart; and 

Figure 10 is a block diagram of another distributed data processing 
environment embodying the present invention. 

Referring first to Figure 1, a distributed data processing system 
embodying the present invention comprises a plurality of application 
server computer systems 40-70 and a Distributed Systems Monitor (DSM) 
server computer system 10 all interconnected via a network 5. 



With reference now to Figure 2, each application server 40-70 
provides a service to a set of client user terminals 90-93. The DSM 
server 10 is also connected to an administration terminal 20. 

Referring to Figure 3, the DSM server 10 comprises a system random 
access memory (RAM) 200, a system read only memory (ROM) 210, a central 
processing unit (CPU) 220, a mass storage device 230 comprising one or 
more large capacity magnetic disks or similar data recording media, one 
or more removable storage means 240 such as floppy disk drives, CD ROM 
drives and the like, a network adaptor 250, a keyboard adaptor 260, a 
pointing device adaptor 270, and a display adaptor 280, all 
interconnected via a bus architecture 290. The CPU 220 is a Pentium 
100MHz central processor (Pentium is a trade mark of Intel Corporation). 
It will be appreciated that other embodiments of the present invention 
may employ an equivalent to a Pentium 100MHz CPU to perform the function 
of CPU 220. The RAM 200 is at least 48 megabytes. A keyboard 300 is 
coupled to the bus architecture 290 via the keyboard adaptor 260. 
Similarly, a pointing device 310, such as a mouse, touch screen, tablet, 
tracker ball or the like, is coupled to the bus architecture 290 via the 
pointing device adaptor 270. Equally, a display output device 320, such 
as a cathode ray tube (CRT) display, liquid crystal display (LCD) panel, 
or the like, is coupled to the bus architecture 290 via the display 
adaptor 280. Additionally, the DSM server 10 is coupled to the terminal 
20 and the servers 40-70 via the network adaptor 250. 

Basic input output system (BIOS) software is stored in the ROM 210 
for enabling data communications between the CPU 220, mass storage 230, 
RAM 200, ROM 210, removable storage 240, and adaptors 250-280 via the bus 
architecture 290. Stored on the mass storage device 230 is operating 
system software and application software including DSM software. Further 
application software may be loaded into the DSM server 10 via the 
removable storage 240 or the network adaptor 250. The operating system 
software enables the DSM server 10 to select and run the application 
software. The application software stored in the DSM server 10 includes 
Lotus Notes Release 4. Lotus Notes 4 is a document-based database 
management system. Further details of Lotus Notes 4 can be found in 
Mastering Lotus Notes 4 by Brown, Brown, Koutchouk and Brown, published 
in 1996 by Sybex, Inc. As will be described shortly, in operation, the 
DSM server 10 employs Lotus Notes 4 to communicate with the application 
servers 40-70. 

It will be appreciated that, in some embodiments of the present 
invention, terminal 20 may be integral to the DSM server 10, with 
monitoring and control functions of the terminal 20 facilitated via the 
display 320 and the input devices 300 and 310 of the DSM server 10. 

Referring to Figure 4, each application server 40-70 comprises a 
system random access memory (RAM) 700, a system read only memory (ROM) 
710, a central processing unit (CPU) 720, a mass storage device 730 
comprising one or more large capacity magnetic disks or similar data 
recording media, one or more removable storage means 740 such as floppy 
disk drives, CD ROM drives and the like, a network adaptor 750, a 
keyboard adaptor 760, a pointing device adaptor 770, and a display 
adaptor 780, all interconnected via a bus architecture 790. The CPU 720 
may be an Intel Pentium 100MHz central processor or equivalent. A 
keyboard 800 is coupled to the bus architecture 790 via the keyboard 
adaptor 760. Similarly, a pointing device 810, such as a mouse, touch 
screen, tablet, tracker ball or the like, is coupled to the bus 
architecture 790 via the pointing device adaptor 770. Equally, a display 
output device 820, such as a cathode ray tube (CRT) display, liquid 
crystal display (LCD) panel, or the like, is coupled to the bus 
architecture 790 via the display adaptor 780. Additionally, each 
application server 40-70 is coupled to the DSM server 10 and to remote 
client terminals 90 via the network adaptor 750. 

Basic input output system (BIOS) software is stored in the ROM 710 
for enabling data communications between the CPU 720, mass storage 730, 
RAM 700, ROM 710, removable storage 740, and adaptors 750-780 via the bus 
architecture 790. Stored on the mass storage device 730 is operating 
system software and application software. The application software 
includes a distributed client-server database application such as Lotus 
Notes 4 and Lotus cc:Mail. In operation, each application server 40-70 
employs the resident client-server database application to communicate 
with both the remote client terminals 90-93 and the DSM server 10. 
Further application software may be loaded into each application server 
40-70 via the removable storage 740 or the network adaptor 750. In 
operation, the operating system software enables each application server 
40-70 to select and run the application software. 

Referring back to Figure 2, the application server 40 is a cc:Mail 
server running Lotus cc:Mail on the OS/2 operating system platform (OS/2 
is a trade mark of International Business Machines Corporation) produced 
by International Business Machines Corporation to provide cc:Mail 
services to users of the connected client terminals 90. The application 
server 50 is a Notes server running Lotus Notes 4 on the Windows Server 
NT operating system (Windows and Windows NT are trade marks of Microsoft 
Inc) produced by Microsoft Inc to provide Notes services to users of the 
connected client terminals 91. The application server 60 is a Notes 
server running Lotus Notes 4 on the OS/2 operating system to provide 
Notes services to users of the connected client terminals 92. The 
application server 70 is a Notes server running Lotus Notes 4 on the UNIX 
or AIX operating systems (UNIX is licensed exclusively through X/Open 
Company Limited; AIX is a trade mark of International Business Machines 
Corporation) to provide Notes services to users of the connected client 
terminals 93. It will be appreciated that, in other embodiments of the 
present invention, there may be more or less than four application 
servers operating on one or more of the aforementioned or different 
operating system platforms. 
It will be appreciated from Figure 1 that the DSM server is located, 
in terms of system architecture, between the application servers 40-70 and 
the administration terminal 20. In operation, the DSM server operates as a 
mid-level systems manager. In operation, the application servers 40-70 
record data transfers in which they are involved, such as message and E 
Mail to or from the connected client terminals 90-93, in log files. The 
log files maintained by the application servers 40-70 are directed to the 
DSM server 10. The DSM server 10 processes the received log files to 
reduce the amount of reporting information sent to the administration 
terminal 20. Provided that the application servers 40-70 can route such 
log files to the DSM server 10, and, in the case of Notes servers 50-70, 
employ the Notes communication protocol, the operating system platform is 
not relevant. 

Referring now to Figure 5, the high level architecture of the DSM 
server 10 comprises a first layer 11 for performing Process, Action, 
Notify and Report functions. Below the first layer 11 is a second 
function layer 12 for performing Log, Analyze and Filter functions. Below 
the second layer 12 is a Lotus Notes layer 13. In operation, the Notes 
layer 13 enables the DSM server 10 to communicate with the application 
servers 50-70. To facilitate such communication, the Notes layer 13 
includes a Notes mail message transfer agent (MTA) 14, a cc:Mail MTA 15, 
a Simple Mail Transfer Protocol (SMTP) mail MTA 16, and an X.400 mail 
MTA (not shown). The message transfer agents avoid the need to include 
special mail gateways to communicate with mail systems which are foreign 
to Notes. Below the Notes layer is a network layer 17 for interfacing 
the DSM monitor 10 with the application servers 40-70. The mass storage 
230 comprises a Notes data store 21, a cc:Mail store 22, and an archive 
data store 23. Data from the application servers 40-70, referred to as 
STATREP and LOG.NSF files 110 or other mail system log files 100, is 
received in the DSM server 10 at the network layer 17 and passed via the 
MTAs 14-16 of the Notes layer 13 to the second layer 12 where it is 
processed by the Log, Analyze, and Filter functions. The Log function 
records incoming data in the mass storage 230. The filtered data is 
passed from the second layer 12 to the first layer 11 where the data is 
processed by the Process, Action, Notify and Report functions. The 
Action function may generate, in response to the received data, 
corrective instructions which are returned to the application server 
40-70. Depending on system configuration, a delay may be imposed in the 
passage of data from the second layer 12 to the first layer 11 in respect 
of one or more of the functions therein. For example, in some embodiments 
of the present invention, the Report function may be set to activate only 
once a week, and the data for the report may be held in the mass storage 
230 until a report is due. The high level architecture of each of the 
application servers 40-70 comprises a mail layer 41 providing Notes or 
cc:Mail functionality on an OS/2, NT or AIX operating system platform as 
the case may be. Below the mail layer 41 is a log layer 42 for supplying 
log files to the DSM server 10. Below the log layer 42 is a network layer 
43 for interfacing with the network layer 17 of the DSM server 10. 
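
The passage of data from the second layer 12 up to the first layer 11 can be pictured with a short Python sketch. It is illustrative only: the class and function names, the numeric severity scale and the filter setting are assumptions made for this example and do not come from the DSM software itself.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    server: str      # originating application server, e.g. "50"
    severity: int    # hypothetical scale: 1 = most severe
    text: str


@dataclass
class SecondLayer:
    """Log, Analyze and Filter functions of the second layer 12 (sketch)."""
    max_severity: int = 2                      # hypothetical filter setting
    store: list = field(default_factory=list)  # stands in for mass storage 230

    def log(self, record):
        # Log: every incoming record is kept, whether or not it is forwarded.
        self.store.append(record)
        return record

    def analyze(self, record):
        # Analyze: decide whether the record matters to the first layer.
        return record.severity <= self.max_severity

    def handle(self, record):
        # Filter: pass the record upward only if analysis selected it.
        logged = self.log(record)
        return logged if self.analyze(logged) else None


def first_layer_notify(record):
    # Notify function of the first layer 11, reduced to formatting a message.
    return f"ALERT server {record.server}: {record.text}"


layer2 = SecondLayer()
incoming = [
    LogRecord("50", 1, "Router task stopped"),
    LogRecord("60", 4, "Routine statistics written"),
]
alerts = [first_layer_notify(r) for r in map(layer2.handle, incoming) if r]
print(alerts)
```

In this sketch both records are logged, but only the severe one reaches the first layer, which mirrors the way the Log function records all incoming data while only filtered data is passed upward.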



Lotus Collection Agent 



Referring to Figure 6, each Notes application server 50-70 runs 
Notes 910 on an operating system 900 such as AIX, OS/2, or NT operating 
system. In each Notes application server 50-70, Notes 910 comprises a 
NOTES.INI file 911 and a plurality of Notes tasks 913-916. The tasks 
913-916 include a Router task 913, a Replicator task 914, and a Reporter 
task 915. In addition, each Notes application server 50-70 includes a 
Notes collection agent 912. The Notes collection agent 912 operates as a 
task within Notes 910. The NOTES.INI file defines the tasks which are to 
be started in Notes when the host Notes application server 50-70 is 
booted. The Notes collection agent 912 is specified within the NOTES.INI 
file. Thus, the Notes collection agent 912 is active whenever the host 
application server 50-70 is operational. When active, the Notes 
collection agent 912 enables the DSM server 10 to communicate with the 
Notes application servers 50-70 via the Notes protocol. The Notes 
application servers 50-70 send their respective operational statistics to 
a database file called LOG.NSF. At definable intervals, the Notes 
collection agent 912 copies information from the LOG.NSF to an 
intermediate database, from which the information is formatted and mailed 
via the Notes layer 41 to the DSM server 10 as represented by 110 in 
Figure 5. The information collected from each Notes application server 
50-70 falls into one of four categories: log data, server tasks, e mail 
routing, and replication. For each category, the user can select one of 
the following options: 

1) Collect and process all documents in that category (default); 

2) Collect all documents of this type now, but process them later; 
and, 

3) Discard all documents of this category. 

The user can change the action for a category at any time. The Notes 
collection agent 912 maintains a time and date stamp to record the last 
successful poll of the LOG.NSF file. This stamp is recorded in the 
NOTES.INI file 911. If one of the tasks 913-916 is started with either no 
or an invalid time stamp, the Notes collection agent 912 will create it 
before processing any data. The parameters used by the Notes collection 
agent 912 can be viewed and configured via each of the Notes servers 
50-70. Statistics can also be recorded in the STATREP database and routed 
to the DSM server 10 for processing as also represented by mail flow 110 
in Figure 5. 
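
As a minimal sketch of the per-category options and the time stamp handling described above, the following Python fragment may help. All names are hypothetical, and the ISO 8601 stamp format is an assumption made for the example; the source does not state how the stamp is stored in the NOTES.INI file.

```python
from datetime import datetime
from enum import Enum


class Action(Enum):
    COLLECT_AND_PROCESS = 1   # option 1 (default)
    COLLECT_ONLY = 2          # option 2: collect now, process later
    DISCARD = 3               # option 3


CATEGORIES = ("log data", "server tasks", "e mail routing", "replication")


def default_settings():
    # Every category starts with the default option.
    return {category: Action.COLLECT_AND_PROCESS for category in CATEGORIES}


def dispatch(documents, settings):
    """Split (category, text) documents into process-now and held-back lists."""
    process_now, held = [], []
    for category, text in documents:
        action = settings[category]
        if action is Action.COLLECT_AND_PROCESS:
            process_now.append(text)
        elif action is Action.COLLECT_ONLY:
            held.append(text)
        # Action.DISCARD: the document is dropped.
    return process_now, held


def ensure_stamp(raw, now):
    """Return a valid last-poll stamp, creating one if missing or invalid."""
    try:
        return datetime.fromisoformat(raw)   # assumed storage format
    except (TypeError, ValueError):
        return now


settings = default_settings()
settings["replication"] = Action.DISCARD
settings["e mail routing"] = Action.COLLECT_ONLY
docs = [("log data", "a"), ("replication", "b"), ("e mail routing", "c")]
print(dispatch(docs, settings))
```

Keeping the action per category in a plain mapping reflects the statement that the user can change the action for a category at any time without touching the other categories.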

cc:Mail Collection Agent 

Application server 40 acts as a cc:Mail Post Office router. The DSM 
server 10 appears to application server 40 as a peer Post Office via the 
cc:Mail MTA 16. Application server 40 further comprises a cc:Mail 
collection agent. The cc:Mail collection agent and the cc:Mail MTA 16 in 
combination enable cc:Mail log files to be mailed by application server 
40 to the DSM server 10. The cc:Mail collection agent is similar in 
function to the Notes collection agent 912 hereinbefore described. In 
operation, the cc:Mail MTA 16 and cc:Mail collection agent cooperate in 
gathering cc:Mail router log data from application server 40 and in 
routing such log data to the DSM server 10 without interrupting normal 
Router function. This process operates as a cc:Mail call list entry 
through which cc:Mail logs are collected and supplied to the DSM server 
10 at predefined intervals. This enables the DSM server 10 to process 
these log files off-line from the cc:Mail message router service provided 
by application server 40. 

DSM server Functions 

Referring now to Figure 7, in operation, the DSM server 10 acts as 
a mid-level system manager for managing the activities of the application 
servers 40-70. To facilitate such management the first layer 11 of the 
DSM architecture includes the following Process functions: 

a) a Monitor function 540; 

b) a Server Probe function 550; 

c) an E Mail Probe function 560; and 

d) a Database Replication Tracking function 570. 

The functions 540-570 are performed by the CPU 220 when configured 
by corresponding software stored in the mass storage 230 of the DSM 
server 10. It will be appreciated that, in other embodiments of the 
present invention, similar functionality may be provided by hardware or 
by a combination of hardware and software. 

Referring back to Figure 2, the log files maintained by the 
application servers 40-70 are directed to the DSM server 10 as generally 
represented by communication paths 80. The log files are stored by DSM 
server 10 in mass storage 230. The log file corresponding to each 
application server 40-70 is stored on a separate disk of mass storage 
230. 

Returning to Figure 7, on receipt of the log files, the Monitor 
function 540 of the DSM server 10 filters the messages contained in each 
log. The filtered messages are sent from the DSM server 10 to the 
terminal 20 for display to an administrator, as generally represented in 
Figure 2 by 90. The Server probe function 550 and the Mail probe function 
560 automatically operate selected ones of the application servers 40-70. 



The integration of the functions 540-570 within the DSM server 10 
enables the DSM server 10 to analyze message log files produced by the 
application servers 40-70; identify operational problems within the 
application server 40-70; identify message routing problems within the 
environment; and store the log files of the application servers 40-70. 

Each function 540-570 is associated with one or more of a plurality 
of input tables 500-530. Specifically, input table 500 is associated with 
the server probe function 550 and the e mail probe function 560; input 
table 510 is associated with the monitor function 540; input table 520 is 
associated with the server probe function 550; and input table 530 is 
associated with the database replication tracking function 570. The 
outputs of the functions 540-570 are supplied to database 580 stored in 
the mass storage 230 and to the administration terminal 20. The database 
replication tracking function 570 also monitors databases 610, 600, and 
590 stored in the mass storage 230. The input tables 500-530 contain 
thresholds and parameters specified by system management staff. The 
thresholds and parameters relate to boundaries or levels of service 
agreed with an end user. For example, if mail is expected to be delivered 
within 3 minutes under a service level agreement, this level will be set 
as a threshold in input table 500 for input to the e mail probe function 
560. If mail delivery exceeds this time, the e mail probe function 560 
generates an alert, and the incident is then recorded in a report held in 
the mass storage 230. 

Monitor Function 

Conventionally, log files from application servers are seldom 
thoroughly examined and acted upon in a practical environment because, as 
mentioned earlier, such activity is time consuming when performed 
manually. Referring back to Figure 1, as mentioned earlier, all server 
log files are routed from the application servers 40-70 to the DSM server 
10. The DSM server 10 comprises a knowledge base stored in the mass 
storage 230. The knowledge base enables the monitor function 540 of the 
DSM server 10 to analyze the contents of the log files received from the 
application servers 40-70 and decide what action, in each case, is 
appropriate. The analysis is a continuous, ongoing activity, with the DSM 
server 10 acting upon the information analyzed by the monitor function 
540 on behalf of the administrator, filtering and relaying messages from 
the application servers 40-70 to the administration terminal 20. The 
monitor function 540 can be configured to filter out only selected items 
from the log files for transmission to the administration terminal 20. 
This feature can be employed to prevent trivial information from reaching 
the administration terminal 20, thereby allowing administration staff to 
react more swiftly to critical messages. 
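
One way to picture a knowledge base that decides what action is appropriate for each log message is an ordered list of pattern and action pairs. The Python sketch below assumes such a structure purely for illustration: the patterns, the action names and the sample log lines are all invented, and the source does not describe the knowledge base format.

```python
import re

# Hypothetical knowledge base: (pattern, action) pairs checked in order.
KNOWLEDGE_BASE = [
    (re.compile(r"router.*(failed|stopped)", re.I), "alert"),
    (re.compile(r"replicat\w* .*error", re.I), "alert"),
    (re.compile(r"disk space", re.I), "report"),
]


def decide_action(message, default="discard"):
    """Return the action of the first matching rule, else the default."""
    for pattern, action in KNOWLEDGE_BASE:
        if pattern.search(message):
            return action
    return default


def relay(messages):
    """Keep only messages worth sending to the administration terminal."""
    return [m for m in messages if decide_action(m) == "alert"]


log_lines = [
    "Router: transfer failed to server 60",
    "Session opened for user",
    "Low disk space on data volume",
]
for line in log_lines:
    print(decide_action(line), "-", line)
```

With a default of "discard", anything the rules do not recognise is treated as trivial, which corresponds to keeping such information away from the administration terminal.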

Error messages contained in the log files from the application 
servers 40-70 (eg: communications, router, security, resource, and server 
environment error messages) are captured and reported by the DSM server 
10 via Notes mail, Simple Network Management Protocol (SNMP) trap 
Protocol Data Units (PDUs), and logging to a Notes database. Control of 
the application servers 40-70 can be passed to predefined user exits with 
the DSM server 10 when the above alerts are processed. 



Each application server 40-70 corresponds to a different dedicated 
disk in the mass storage 230 of the DSM server 10. The dedicated disk is 
employed to record all information relevant to the corresponding 
application server 40-70. Specifically, the information is organised by 
the DSM server 10 in a standard format with sub-directories named DATA, 
STATUS, and REPORTS. 
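
The standard layout with DATA, STATUS and REPORTS sub-directories could be created along the following lines. This is a sketch under assumptions: an ordinary directory per server stands in for the dedicated disk, and the function name and the server name "server50" are invented for the example.

```python
from pathlib import Path
import tempfile

SUBDIRECTORIES = ("DATA", "STATUS", "REPORTS")


def create_server_area(root, server_name):
    """Create the DATA, STATUS and REPORTS sub-directories for one server."""
    area = Path(root) / server_name
    for name in SUBDIRECTORIES:
        (area / name).mkdir(parents=True, exist_ok=True)
    return area


# A temporary directory is used so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as root:
    area = create_server_area(root, "server50")
    names = sorted(p.name for p in area.iterdir())

print(names)
```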

[...] and stored in [...] wishing to review the latest activities of 
each application server 40-70. A Graphical User Interface (GUI) provided 
by Notes 4 enables administration staff to view all information collected 
by the DSM server 10 from the application servers 40-70 via a database 
navigator. 

Information from server logs and statistics may be summarised on a 
weekly and monthly basis to provide administrators with information to 
manage present data processing requirements and plan for future demands. 
The information is held on a Notes database in the form of Notes 
statistics, operating system statistics, network statistics, and response 
time summaries. The application server log files and statistics can be 
archived via the DSM server 10 automatically on a monthly basis to a 
desired destination. 

Server Probe Function 

The server probe function 550 monitors, via the Notes collection 
agent 912, each Notes application server 50-70 to ensure activation of 
the Notes tasks 913-916 specified in the NOTES.INI file of each Notes 
application server 50-70. Tasks 913-916 which are not running (perhaps 
due to failure) are started automatically by the server probe function 
550. In addition, the server probe function 550 records and summarises, 
via the Notes collection agent 912 in each Notes application server 
50-70, response times from the DSM server 10 to the Notes application 
servers 50-70 on a daily basis. 

If any of the Notes application servers 50-70 goes off-line for any 
reason, the server probe function 550 will raise a severity 1 alert. The 
severity 1 alert is sent by the DSM server 10 to the administration 
terminal 20. The server probe function 550 continuously checks, via the 
Notes collection agent 912, that the specified tasks 913-916 are active 
and functioning correctly. In the event of a problem with any of the 
tasks 913-916, the server probe function 550 will automatically attempt 
to restart it via the Notes collection agent 912. After a predefined 
number of failed attempts to restart a particular task 913-916, the 
server probe function 550 routes an alert to the administration terminal 
20. Failure of the host application server 50-70 is recorded by the DSM 
server 10 both within and beyond committed service time. 

The server probe function 550 will now be described with reference 
to the flow chart of Figure 8. In operation, the server probe function 
550 issues commands in Notes protocol to a target application server 
50-70. The commands issued by the server probe function 550 are handled 
within Notes 910 by the Notes collection agent 912 of the target 
application server 50-70. Initially, at block 1000, the server probe 
function 550 resets a restart count to zero. Then, at block 1010, the 
server probe function 550 sends a "show task configuration" command to 
the Notes collection agent 912. The "show task configuration" command 
captures from the NOTES.INI file in the target application server 50-70 
the tasks 913-916 which should be active. At block 1020, the server probe 
function 550 sends a "show active tasks" command to the target 
application server 50-70. The "show active tasks" command captures the 
tasks 913-916 which are active on the target application server 50-70. At 
block 1030, the server probe function 550 compares the task configuration 
with the active tasks. If the task configuration is the same as the 
active tasks, then the server probe function 550 terminates at block 
1080. If the task configuration is different from the active tasks, 
indicating that one or more of the tasks 913-916 has failed, then, at 
block 1040, the server probe function 550 determines if the restart count 
equals a predetermined threshold of attempts to restart the failed tasks. 
If so, then, at block 1040, the server probe function 550 issues an alert 
message for supply to the administration terminal 20. If not, then at 
block 1050, the server probe function increments the restart count and, 
at block 1060, attempts to restart those tasks specified in the task 
configuration which were reported as inactive. The server probe function 
then continues around the loop defined by blocks 1020, 1030, 1040, 1060, 
and 1070 until either all required tasks are active or the threshold is 
exceeded. 
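
The probe loop just described can be sketched in a few lines of Python. The sketch is not the DSM implementation: the callables passed in are hypothetical stand-ins for the "show task configuration" and "show active tasks" commands and for the restart and alert paths, and the comparison is simplified to looking for configured tasks that are not active.

```python
def probe_server(get_configured, get_active, restart, send_alert, max_restarts=3):
    """Probe one server; return True if all configured tasks end up active."""
    restart_count = 0                      # reset the restart count
    configured = set(get_configured())     # "show task configuration"
    while True:
        active = set(get_active())         # "show active tasks"
        missing = configured - active      # configured tasks that are not active
        if not missing:
            return True                    # nothing to do: the probe terminates
        if restart_count == max_restarts:  # threshold reached: alert instead
            send_alert(sorted(missing))
            return False
        restart_count += 1
        for task in sorted(missing):       # try to restart each inactive task
            restart(task)


class FakeServer:
    """Stand-in for a target server; the Replicator never restarts."""

    def __init__(self):
        self.configured = ["Router", "Replicator", "Reporter"]
        self.active = {"Reporter"}
        self.restart_calls = []

    def restart(self, task):
        self.restart_calls.append(task)
        if task != "Replicator":
            self.active.add(task)


server = FakeServer()
alerts = []
ok = probe_server(lambda: server.configured, lambda: server.active,
                  server.restart, alerts.append, max_restarts=2)
print(ok, alerts, server.restart_calls)
```

In the example run the Router is recovered on the first pass, while the Replicator is retried until the restart count reaches the threshold, at which point a single alert naming the Replicator is raised.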

- E-Mail Probe Function 

The E Mail probe function 560 tests mail routes in the network of
Notes application servers 50-70 by measuring the time taken for a test
message in the form of a Lotus Notes document to complete a return trip
to a reflecting server 50-70 against predefined thresholds. An example of
a test report produced by the E Mail probe function 560 is provided in
Appendix A hereto. The E Mail probe function 560 generates an alert if a
threshold is exceeded. Additionally, the E Mail probe function 560
generates reports including elapsed time across each E Mail application
server en route. Specifically, the E Mail probe acquires the local date
and time from each server 50-70 both on entry and exit. The entry and
exit date and time for each server are recorded in the Notes document
forming the test message. In any mail application it is important for
administration staff to know if there are any mail delivery problems and
the time taken to deliver the mail. The E Mail probe function converts
problems arising in the Notes mail network into alerts for forwarding
to the administration terminal 20. The E Mail probe function 560 also
automatically generates mail tracking reports.
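
By way of illustration only, the per-server timing and the threshold test
might be sketched as below. The names seconds_at_server and
mail_probe_report are hypothetical, the date format (day-month-year) is an
assumption about the stamps shown in Appendix A, and the threshold value
used in any call is chosen by the caller rather than taken from the text.
Because each stamp is the local date and time of the server concerned, the
sketch only ever subtracts two stamps recorded on the same server.

```python
from datetime import datetime
from typing import List, Tuple

# Assumed layout of an entry/exit stamp, e.g. "12-02-97 04:47:12 PM".
STAMP_FORMAT = "%d-%m-%y %I:%M:%S %p"


def seconds_at_server(entered: str, left: str) -> int:
    """Time spent at one server, from that server's own entry and exit stamps."""
    t_in = datetime.strptime(entered, STAMP_FORMAT)
    t_out = datetime.strptime(left, STAMP_FORMAT)
    return int((t_out - t_in).total_seconds())


def mail_probe_report(
    hops: List[Tuple[str, str, str]],   # (server name, entry stamp, exit stamp)
    round_trip_seconds: int,            # measured return-trip time of the probe
    threshold_seconds: int,             # hypothetical predefined threshold
) -> Tuple[List[Tuple[str, int]], bool]:
    """Return the per-server elapsed times and whether an alert is due."""
    per_hop = [(name, seconds_at_server(e, l)) for name, e, l in hops]
    alert = round_trip_seconds > threshold_seconds
    return per_hop, alert
```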


Replication Tracking Function

In some Notes applications it is important that data stored in
databases is shadowed between different application servers 50-70. Such
shadowing can be achieved via the Notes replication task 914. The Notes
replication tracking function 570 of the DSM server 10 checks databases
on the application servers 40-70 to establish if they are synchronised
after a Notes replicator task 914 has been executed by two or more of the
Notes application servers 50 to 70. If the databases are out of
synchronisation, the DSM server 10 sends an alert to the administration
terminal 20. The Notes replication tracking function 570 verifies that,
after Notes replication server activity has occurred, databases of the
same replica ID have the same contents.

By way of example with reference to Figure 1, suppose a database
ABC.NSF is stored on application server 50 and replicated on application
server 60 via the replication task 914. Hence, both application
server 50 and application server 60 store a copy of ABC.NSF. Suppose
now that a client user connected to application server 50 is modifying
the copy of ABC.NSF stored on application server 50 and, simultaneously,
a client user connected to application server 60 is modifying the copy of
ABC.NSF stored on application server 60. The replication task 914 on
application server 50 periodically replicates the modified copy of ABC.NSF
on application server 60. Likewise, the replication task 914 on application
server 60 periodically replicates the modified copy of ABC.NSF on
application server 50. The frequency at which replication takes place can
be preset according to user needs. For example, if the database contains
relatively important information which is frequently modified by client
users, then correspondingly frequent replication activity might be
appropriate. Conversely, if the information contained in the database is
less important, replication may be set to take place less frequently. From
a management perspective, it would be desirable to ensure that the
replication tasks on application servers 50 and 60 are set to perform
replication of ABC.NSF in keeping with the regularity with which end user
clients of application servers 50 and 60 independently modify ABC.NSF.
This problem is solved by the replication tracking function 570 of the
DSM server 10.

Referring to the accompanying figure, in operation, the replication
tracking function 570 is initialised, at block 1100, by setting a SAMPLING
INTERVAL to the desired number of samples of the copies of ABC.NSF on
application servers 50 and 60 to be taken; by setting a TARGET COUNT to
the number of matches in the copies of ABC.NSF on application servers 50
and 60 to be found in the sampling interval; and by resetting running
totals HIT COUNT and SAMPLE COUNT to zero. At block 1110, the replication
tracking function 570 samples both ABC.NSF stored on application server 50
and ABC.NSF stored on application server 60. At block 1120, the replication
tracking function 570 compares the two samples. If the two samples match,
then, at block 1130, the replication tracking function 570 increments HIT
COUNT and progresses to block 1140. If the two samples do not match, then
the replication tracking function progresses directly to block 1140, at
which SAMPLE COUNT is incremented. At block 1150, the replication
tracking function 570 compares SAMPLE COUNT with SAMPLING INTERVAL. If
SAMPLE COUNT does not equal SAMPLING INTERVAL, then the replication
tracking function 570 returns to block 1110 to collect the next pair of
samples. If the SAMPLE COUNT has reached SAMPLING INTERVAL, then the
replication tracking function 570 compares, at block 1160, HIT COUNT with
TARGET COUNT. If HIT COUNT is less than TARGET COUNT then, at block 1170,
the replication tracking function 570 issues an alert to the
administration terminal indicating that the copies of ABC.NSF stored on
application servers 50 and 60 are not synchronised. Otherwise, the
replication tracking function 570 terminates.

It will be appreciated that the replication tracking function 570
may be employed to track replication of more than one database. Equally,
it will be appreciated that the replication tracking function 570 may be
employed to track replication of more than two copies of the or each
database. Furthermore, it will be appreciated that the replication
tracking function 570 may apply different test parameters (eg: SAMPLING
INTERVAL, TARGET COUNT) to different databases or groups of databases.
It will also be appreciated that the samples forming the sampling
interval may be taken over a relatively short period of time or over a
relatively long period of time depending on customer requirements.
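
The sampling procedure described above can be sketched as follows. This is
a sketch under stated assumptions rather than the implementation of the
replication tracking function 570: the function name track_replication is
hypothetical, the two sampling callables stand in for whatever is read
from each copy of the database, and a "match" is taken to mean simple
equality of the two samples.

```python
from typing import Callable, Hashable


def track_replication(
    sample_a: Callable[[], Hashable],   # hypothetical: sample the copy on one server
    sample_b: Callable[[], Hashable],   # hypothetical: sample the copy on the other
    sampling_interval: int,             # SAMPLING INTERVAL: number of sample pairs
    target_count: int,                  # TARGET COUNT: matches required
    send_alert: Callable[[str], None],  # hypothetical: alert the administrator
) -> bool:
    """Return True if the copies are judged synchronised, False after an alert."""
    if sampling_interval < 1:
        raise ValueError("sampling_interval must be at least 1")
    hit_count = 0                       # HIT COUNT reset to zero
    sample_count = 0                    # SAMPLE COUNT reset to zero
    while True:
        a, b = sample_a(), sample_b()   # collect the next pair of samples
        if a == b:                      # block 1120: compare the two samples
            hit_count += 1              # block 1130
        sample_count += 1               # block 1140
        if sample_count == sampling_interval:   # block 1150
            break
    if hit_count < target_count:        # block 1160
        send_alert("copies are not synchronised")   # block 1170
        return False
    return True
```

Keeping SAMPLING INTERVAL and TARGET COUNT as parameters reflects the
remark that different test parameters may be applied to different
databases or groups of databases.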

20 „: j Applic J a t l io ^,: s fWr- T ^T Hata -^>^oe^^ X'Ap.o ^/ -^^qo-i . « 

,V ai:/.^ x.or,--r . *o -i-y ■ .! .»".. ri-* J 'i* v>.' 

Because, in embodiments of the present invention, the log files are
retained by the mass storage of the DSM server 10 rather than by the
application servers 40-70, the application servers 40-70 are able to
dedicate more resource to client activities.

DSM Server Hierarchy and Scalability

The DSM server 10 can operate as a module. Therefore, multiple DSM
servers can be employed to accommodate a large number of different
application servers, with each DSM server serving a different group of
application servers. Referring now to Figure 5, in an example of such an
arrangement, there is provided a plurality of application servers 400-450
and a plurality of DSM servers 460-480. Each DSM server 460-480 is
connected to a different group of application servers 400-450. The DSM
servers 460-480 are each connected to a DSM master server 490. In
operation, all MTA communications with the application servers 400-450
are handled by the DSM servers 460-480, but the DSM servers 460-480
communicate with the master DSM server 490 via the Notes MTA alone. In
the example of the present invention hereinbefore described with
reference to Figure 5, there are 3 DSM servers 460-480. However, it will
be appreciated that, in other embodiments of the present invention, there
may be only two DSM servers reporting to the master DSM server 490.
Likewise, in other embodiments of the present invention, there may be
greater than three DSM servers either reporting to the master DSM server
490 or to one or more further layers of intermediate DSM servers
arranged in a hierarchical structure ending at the master DSM server
490.



Typically, operational centres are consolidated into one or two
geographical areas in the interests of cost. This means that from
relatively few operational centres, systems management control is
exercised over relatively large regions. The arrangement shown in Figure
5 is particularly suitable for this scenario.
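
As a rough illustration of the hierarchy, the path taken by a report from
an application server up to the master DSM server might be modelled as
below. The grouping of application servers under each DSM server is
entirely hypothetical (the description gives only the ranges 400-450 and
460-480, not which servers belong to which group), as is the function
name route_report.

```python
from typing import Dict, List

# Hypothetical grouping of application servers under each DSM server.
GROUPS: Dict[int, List[int]] = {
    460: [400, 410],
    470: [420, 430],
    480: [440, 450],
}
MASTER_DSM = 490  # master DSM server


def route_report(app_server: int,
                 groups: Dict[int, List[int]] = GROUPS,
                 master: int = MASTER_DSM) -> List[int]:
    """Return the chain: application server, its DSM server, master DSM server."""
    for dsm_server, members in groups.items():
        if app_server in members:
            return [app_server, dsm_server, master]
    raise KeyError(f"application server {app_server} is not in any group")
```

Further layers of intermediate DSM servers could be modelled by letting a
group's parent be another DSM server instead of the master.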

By way of summary then, what has been hereinbefore described by way
of example of the present invention is a server probe for use in a
distributed data processing system having plural application server
computer systems interconnected via a network, each application server
having a database application including a plurality of executable data
management tasks and an initialisation table storing a task
configuration indicative of which ones of the tasks should be active. The
server probe reads the task configuration from the initialisation table
from a target one of the application servers, identifies the tasks which
are active in the target one of the application servers, and generates an
event message if active tasks identified differ from the tasks specified
in the initialisation table.

APPENDIX A

DSM Mail Probe Status Information for D06ML002/06/M/IBM

Basic Information

Mail Probe Reflector Name                          Public/UK/IBM@IBMGB@GBLPO000
Time Taken For Mail Probe to Return                957 seconds
Number of Server(s) Mail Probe Was Routed Through  12
Domains That The Mail Probe Passed Through         IBMGB, UKGNAMIG, GBLPO000

Detailed Information

» denotes times for Mail Probe on its journey to the Reflector
« denotes times for Mail Probe on its journey back from the Reflector

Notes Server Name    Time Mail Probe        Time Mail Probe        Time Mail Probe
                     Entered MAIL.BOX       Left MAIL.BOX          Spent At Server
                     On Server              On Server

[illegible]          12-02-97 04:34:24 PM   [illegible]            1 second(s) »
GBLPR401/GBLPR4      12-02-97 04:37:52 PM   [illegible]            943 second(s) »
GBMP0028/LCS         12-02-97 04:47:12 PM   12-02-97 04:47:15 PM   3 second(s) »
DSMLN003/UKGNAMIG    12-02-97 04:50:47 PM   12-02-97 04:50:49 PM   2 second(s) »
D06HUBM1/06/H/IBM    12-02-97 04:47:35 PM   12-02-97 04:47:36 PM   1 second(s) »
D06ML002/06/M/IBM    12-02-97 05:51:14 PM   12-02-97 05:51:15 PM   1 second(s) »
D06ML002/06/M/IBM    12-02-97 05:51:15 PM   12-02-97 05:51:15 PM   0 second(s) «
D06HUBM1/06/H/IBM    12-02-97 04:47:37 PM   12-02-97 04:47:38 PM   1 second(s) «
DSMLN003/UKGNAMIG    12-02-97 04:50:51 PM   12-02-97 04:50:53 PM   2 second(s) «
GBMP0028/LCS         12-02-97 04:47:21 PM   12-02-97 04:47:22 PM   1 second(s) «
GBLPR401/GBLPR4      12-02-97 04:53:44 PM   12-02-97 04:53:48 PM   4 second(s) «
GBLPR403/GBLPR4      12-02-97 04:50:21 PM   12-02-97 04:50:21 PM   0 second(s) «

CLAIMS



1. Server probe apparatus for a distributed data processing system
having plural application server computer systems interconnected via a
network, each application server having a database application including
a plurality of executable data management tasks and an initialisation
table storing a task configuration indicative of which ones of the tasks
should be active, the apparatus comprising:

means for reading the task configuration from the initialisation
table from a target one of the application servers;

means for identifying the tasks which are active in the target one
of the application servers; and

means for generating an event message if active tasks identified
differ from the tasks specified in the initialisation table.

2. Apparatus as claimed in claim 1, wherein the generating means
comprises restart means for making successive attempts to restart tasks
which are specified in the initialisation table but inactive in the
target one of the application servers and means for generating the event
message after a predetermined plurality of failed attempts by the restart
means to restart the or each inactive task.

3. Apparatus as claimed in claim 1 or claim 2, comprising an
administration terminal having means for displaying the event messages.

4. A distributed data processing system having plural application 
server computer systems interconnected via a network, each application 
server having a database application including a plurality of executable 
data management tasks and an initialisation table storing a task 
configuration indicative of which ones of the tasks should be active, and 
server probe apparatus as claimed in any preceding claim. 

5. A method for managing data management tasks in a distributed data 
processing system having plural application server computer systems 
interconnected via a network, each application server having a database 
application including a plurality of executable data management tasks and 
an initialisation table storing a task configuration indicative of which 
ones of the tasks should be active, the method comprising the steps of:

reading the task configuration from the initialisation table from a
target one of the application servers;

identifying the tasks which are active in the target one of the
application servers; and

generating an event message if active tasks identified differ from
the tasks specified in the initialisation table.

6. A method as claimed in claim 5, wherein the generating step
comprises:

making successive attempts to restart tasks which are specified in
the initialisation table but inactive in the target one of the
application servers; and

generating the event message after a predetermined plurality of
failed attempts to restart the or each inactive task.

7. A method as claimed in claim 5 or claim 6, comprising displaying
the event message on a display device.